ABSTRACT

We study a scenario for route planning in road networks, 
where the objective to be optimized may change between 
every shortest path query. Since this invalidates many of the 
known speedup techniques for road networks that are based 
on preprocessing of shortest path structures, we investigate 
optimizations exploiting solely the topological structure of 
networks. We experimentally evaluate our technique on a 
large set of real-world road networks of various data sources. 
With lightweight preprocessing our technique answers
long-distance queries across continental networks significantly
faster than previous approaches towards the same problem
formulation.


1. INTRODUCTION 

Road networks of large geographic regions such as Europe
or the U.S. easily consist of hundreds of millions of nodes, and
collaborative spatial data collection efforts, such as
OpenStreetMap (OSM), have grown in node count by two
orders of magnitude over the last years. On such large
networks, Dijkstra’s classical shortest path algorithm incurs
substantial running times of several seconds even on modern
computer hardware. This is too slow for many applications
such as navigation, route planning, location-based services,
range and trajectory queries, k-nearest-neighbor search, and
other queries on spatial network databases. Hence, the past
decade has seen extensive research (by both theoretical and
applied communities) into techniques that accelerate shortest
path queries. For an overview see the recent surveys [3, 33].


Assuming that the graph metric is fixed or does not change
too often, these techniques offer very fast queries at
considerable preprocessing effort, enabling route planning services
that serve millions of users per day. However, if instead costs
change for every query, these techniques cease to provide
benefit over Dijkstra’s algorithm. Yet, in practice, even the
same user might prefer a quickest route in the morning but
a safe and fuel-efficient route back home.




Figure 1: OSM (left) and DIMACS (right) data
sources of the area with a longitude in [8.50103,
8.52117] and latitude in [48.9476, 48.9596]. Panels:
(a) OSM Input, (b) DIMACS Input, (c) OSM Biconn. Comp.,
(d) DIMACS Biconn. Comp., (e) OSM TopoCore,
(f) DIMACS TopoCore. Nodes
are drawn at geographical position. Arcs are drawn
without direction for clarity. Non-core nodes are
red. Nodes not in the largest biconnected component
are grayed out. Nodes in the TopoCore (see
Section 6) are green.






This scenario is considered in Personalized Route Planning
(PRP), a problem that was recently introduced in a
VLDB best paper [17]. Here, every arc in the road graph is
associated with a vector c of several non-negative numeric
costs such as, for example, travel time, distance, speed,
emissions, and energy consumption. The input of a query, in
addition to the source and the target node, consists of a cost
vector w with non-negative entries. In the search, every arc
is associated with the scalar product of w and c. The output
consists of the shortest path with respect to this weighted
sum of costs. Solving the PRP problem efficiently seems
very useful in order to construct route planning services that
adapt to the individual needs of every person.
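The query model just described can be illustrated with a minimal sketch: a plain Dijkstra search in which each arc weight is the scalar product of the query vector w and the arc cost vector c. The function name `prp_query`, the dictionary-based graph layout, and the toy graph are assumptions made for this illustration only; they are not the implementation evaluated later.

```python
import heapq


def prp_query(arcs, s, t, w):
    """Dijkstra on scalarized costs.

    arcs: dict mapping a node to a list of (head, cost_vector) pairs.
    w:    query weights, same length as every cost vector.
    Returns (distance, path) or (None, []) if t is unreachable.
    """
    dist = {s: 0}
    parent = {s: None}
    queue = [(0, s)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry, u was settled with a smaller key
        if u == t:
            break
        for v, c in arcs.get(u, []):
            # arc weight = scalar product of w and c
            nd = d + sum(wi * ci for wi, ci in zip(w, c))
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(queue, (nd, v))
    if t not in dist:
        return None, []
    path = []
    node = t
    while node is not None:
        path.append(node)
        node = parent[node]
    return dist[t], path[::-1]


# Toy graph (hypothetical): cost vectors are (travel time, distance).
toy = {
    "s": [("a", (10, 5)), ("b", (4, 9))],
    "a": [("t", (10, 5))],
    "b": [("t", (4, 9))],
}
```

With w = (1, 0) the search prefers the route via b (fast but long), with w = (0, 1) the route via a (short but slow). The same graph thus yields different optimal paths per query, which is why preprocessing for one fixed metric does not help.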

Unfortunately, in practice not all routing constraints can
be modeled as a linear combination of additive costs. For
example, summing up height limitations is not meaningful
(e.g., a 3 m high truck will not fit through two consecutive
tunnels of 2 m height). A similar observation holds for vehicle
weight limitations or the limit on the maximum slope that
a vehicle can climb. Further constraints are the avoidance
of certain road categories, such as, for example, highways,
city centers, or water conservation zones (which trucks with
dangerous goods are not allowed to traverse). In this work,
we generalize PRP to also support such restrictions.


1.1 Related Work 


The classic solution to solving shortest path problems on
road networks is Dijkstra’s algorithm [14]. Slightly faster
queries are achieved by employing bidirectional search from
both source and target [5, 30]. Furthermore, A* (or heuristic)
search [30] using easily available bounds (e.g., Euclidean
distance) is still a common choice. However, some studies,
such as [23], have come to the conclusion that on road
networks, A* with Euclidean distance bounds is not necessarily
beneficial over Dijkstra’s algorithm; it can even slightly
decrease efficiency. We have witnessed similar behavior in
preliminary experiments in our specific setting.

Many techniques have been proposed for further acceleration.
Nearly all of these divide the work into two phases: In
a preprocessing phase the graph is augmented with auxiliary
data that is then exploited during the query phase for faster
shortest path or distance retrieval. A good overview of
techniques is given in [3, 33]. Examples are graph partition-based
techniques [16, 32], landmark-based A* (ALT) [15, 23],
Contraction Hierarchies [21, 27], and Hub Labeling, the
latter of which can be implemented on a DBMS.

The above techniques work on the common assumption that
costs are known during the preprocessing phase. Since the
preprocessing effort is substantial, this can have a deterrent
effect for real applications. Hence, techniques have been
proposed that further subdivide the preprocessing phase,
resulting in tool chains that relatively quickly customize the
preprocessing to new costs [13], boiling them down to a
single fixed scalar cost to be considered by queries.
Employing heavy parallelization on multi-core machines, or even
multiple GPUs, these techniques achieve customization
times faster than a single Dijkstra query. However, if costs
change for every query, spending so much computational
effort seems questionable. (In a server setting, such resources
could serve other clients in parallel; in a client setting they
might not be available.) This is the case for the scenario
considered in our work: Personalized Route Planning (PRP),
introduced by [17], where it is approached based on k-path
covers. A k-path cover C is a small node subset of the
original graph such that any simple (loop-free) path contains
at most k − 1 successive nodes that are not in C. The core
idea for accelerating PRP queries consists of computing a
coarsened path that only contains nodes in C where possible.
Unfortunately, computing a minimum k-path cover is
NP-hard. For this reason, approximate solutions were
used in [17]. Note that the k-path cover approach is inspired by the
k-skip covers introduced in [34]. The main difference is that
k-skip covers only guarantee that any shortest (i.e., w.r.t. a
fixed scalar cost) path contains at most k − 1 successive nodes
not in C. The concept of k-skip covers is related to shortest
path covers, which have been used to show worst-case
bounds for many speedup techniques (on graphs with small
shortest path cover size).

The PRP problem is essentially a high-dimensional, linear
multi-criteria search problem, related to the parametric
shortest path problem. Extensions of known preprocessing
techniques to multi-criteria optimization have been proposed,
but were only evaluated experimentally for the bi-criteria
and tri-criteria case. Even for the three criteria of travel
time, travel distance, and fuel consumption (which are even
quite correlated), diminishing returns in terms of query speed
over preprocessing effort have been reported [18]. Related
approaches include Pareto-SHARC [10], which drops
exactness in its practical variant, and Contraction Hierarchies
with edge restrictions [20].

1.2 Our Contribution 

The primary results of our work are: 

• We generalize Personalized Route Planning (PRP) to
support a richer set of restrictions. The generalization
allows modeling, for example, maximum vehicle
heights (e.g., for tunnels) and maximum vehicle weights
(e.g., for bridges) as well as user preferences such as
avoidance of highways.

• We present a new preprocessing-based algorithm for PRP,
extending the bilevel Dijkstra of [32]. While we build on basic
and easy-to-implement concepts, in combination our
approach is better at PRP than the state of the art. A
key ingredient is efficient identification of topologically
important core nodes, while preserving all (not just
shortest) paths. Figure 1 shows aspects of our
construction, which is computed optimally in time linear
in the size of the input graph.

• We conduct an extensive experimental study on a large
set of real-world road graphs of different data sources.
Our algorithms achieve significantly faster personalized
route planning queries than previous approaches at
lower preprocessing cost. Furthermore, our query times
are well below one second even on the largest instance
tested for random long-distance queries. This is fast
enough for a wide range of applications. Note that in
practice most queries are short-distance queries, which
result in even lower query times.

• Our analysis further shows that performance gains
vary significantly depending on the data source, as
opposed to just the geographical instance considered.
While observed before, overall this is surprisingly
underreported in the literature on route planning in road
networks. We conclude that ranking road networks just
by node count is not meaningful, and cross comparisons
of the performance of route planning techniques are
inconclusive without careful consideration of the
respective data sources used for experimental evaluation.

1.3 Outline 

We start with basic notation in Section 2. In Section 3 we
formalize generalized arc costs supported by our approach.
Section 4 discusses fine-tuning Dijkstra’s algorithm, since it
is a central search subroutine for our query algorithm. In
Section 5 we describe how Dijkstra’s algorithm is adjusted
to make use of our preprocessing scheme. In Section 6 we
explain in detail how to precompute the TopoCore and the
TopoCore-IS. Finally, in Section 7 we report methodology,
setup and results of our careful experimental evaluation.

2. PRELIMINARIES 

We denote by $G = (V, A)$ a directed graph with node set $V$
and arc set $A \subseteq V \times V$. An undirected graph is denoted by
$G = (V, E)$ where $E$ is the edge set. For road networks, a
node corresponds to a position on the earth’s surface and an
arc to a road segment between two positions. In particular,
not every node models a road intersection. For most arcs
$(u, v)$ there is a back-arc $(v, u)$. However, there are notable
exceptions such as one-way streets or highways, which are
modeled as two separate one-way streets. We consider
multi-cost graphs, where each arc is associated with several costs,
such as travel time or distance. Denote by $k$ the number
of costs. Formally, we have a function $c : A \to \mathbb{R}_{\geq 0}^k$. An
$st$-path between a source node $s$ and a target node $t$ is a
sequence $s u \ldots v t$ of pairwise adjacent nodes. A graph is
called biconnected if, after removing any node $v \in V$, the
remaining graph $G - v$ is still connected. A biconnected
component (BCC) is a maximal subgraph of $G$ that is biconnected.
An independent set $I$ is a subset of $V$ such that no two nodes
$u, v \in I$ are adjacent, i.e., no edge $\{u, v\} \in E$ exists.

3. GENERALIZED COSTS 

In its original formulation the PRP problem consists of 
finding a path of minimum user-specified linear combination 
of additive costs. However, this is too restrictive in practice 
as some important constraints cannot be modeled as additive 
costs. For example, one cannot simply add height limitations 
of two consecutive tunnels. Other real-world restrictions 
such as vehicle width, vehicle weight, or maximum climbing 
ability (depending on the slope) essentially fall into the same 
category: Every road has a certain threshold value (e.g., the
tunnel height), and if the vehicle’s characteristic value (e.g.,
its height) is above this threshold, the vehicle is not allowed
to traverse the road. Clearly, adding these threshold values is
not meaningful; instead one needs to compute the minimum
of thresholds: A vehicle can pass through every tunnel on
a path if and only if it can pass through the lowest tunnel.
Restrictions that are formalized by upper bounds on vehicle
characteristics are the most common. However, there are also
restrictions that result in a lower bound. An example is the
minimum required speed on highways that bans vehicles that
cannot go fast enough.

Another source of restrictions is that some road categories
are forbidden for some vehicle types. For example, many city
centers ban large trucks. Some trucks carry dangerous goods
and are therefore not allowed in water conservation zones.
Some drivers want to avoid highways with toll. All of these
restrictions have in common that some roads are flagged
and some vehicles are not allowed to traverse them. It is
possible to regard them as 1-bit height limitations. However,
we prefer another view: We attach to every road a bitfield
where the i-th bit stands for the i-th restriction of this type.
By convention we say that a bit being set means that a road
can be traversed. A path can be traversed if every road in it
can be traversed. Formally this consists of computing the
bitwise-and of all road bitfields and testing the bits in the
result against the vehicle restrictions or user preferences.

We support all these criteria by generalizing the PRP
scenario. The user does not input a vector of query weights $w$,
but an arbitrary function $f$ that fulfills a set of requirements.
We require $f$ to map cost vectors onto a value from $\mathbb{R}_{\geq 0} \cup \{\infty\}$.
We further need an operation $\circ$ that combines two cost
vectors. We require that $\circ$ is associative, i.e., for any cost
vectors $c_1$, $c_2$, and $c_3$ we require that
$(c_1 \circ c_2) \circ c_3 = c_1 \circ (c_2 \circ c_3)$.
Furthermore, it must not matter whether we first combine
two cost vectors $c_1$ and $c_2$ and then apply $f$, or whether we
first apply $f$ to both vectors and then compute the sum of the
results. Formally, we require that $f(c_1 \circ c_2) = f(c_1) + f(c_2)$,
which is the definition of $f$ being a semigroup homomorphism.

In the case of linear combinations, $f$ is the scalar product
with $w$, and the $\circ$-operation is the component-wise
addition. However, we can also do component-wise minimum
or maximum, since it is associative, and even choose
different operations for different cost components. The right
operation for height limitations (and similar restrictions)
is to compute the minimum of all height limitations. The
function $f$ then maps the cost vector onto $\infty$ if the vehicle
is too high and otherwise ignores that cost component. The
$\circ$ operation for road categories is the bitwise-and operation,
which is fortunately also associative. The function $f$ tests
whether a certain bit, such as the highway bit, is set or not.
Depending on the outcome, $f$ evaluates to $\infty$ or $f$ looks only
at the other cost components.
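The requirements on the combine operation and on the query function can be made concrete with a small sketch. It assumes a hypothetical cost vector with two additive components, one height threshold (combined by minimum), and one bitfield (combined by bitwise-and). The names `Cost`, `combine`, and `make_f` are illustrative and not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    time: int          # additive component
    distance: int      # additive component
    max_height: float  # threshold, math.inf means unrestricted
    flags: int         # bitfield, bit set means "may be traversed"


def combine(a: Cost, b: Cost) -> Cost:
    """The combine operation: add, add, minimum, bitwise-and (all associative)."""
    return Cost(a.time + b.time,
                a.distance + b.distance,
                min(a.max_height, b.max_height),
                a.flags & b.flags)


def make_f(w_time, w_dist, vehicle_height, required_flags):
    """Build a query function f mapping a Cost to a value in [0, inf]."""
    def f(c: Cost) -> float:
        if vehicle_height > c.max_height:
            return math.inf  # vehicle too high for some arc
        if (c.flags & required_flags) != required_flags:
            return math.inf  # some required bit is cleared
        return w_time * c.time + w_dist * c.distance
    return f


# Hypothetical arcs: a is unrestricted for both flag bits, b is a low
# tunnel that additionally has bit 1 cleared.
a = Cost(10, 100, 4.0, 0b11)
b = Cost(20, 300, 2.0, 0b01)
car = make_f(1, 2, 1.5, 0b01)
truck = make_f(1, 2, 3.0, 0b00)
```

For these values both `car` and `truck` satisfy f(a ∘ b) = f(a) + f(b): the car pays 210 + 620 = 830, while the truck gets infinity on b and therefore on the combined path. Because infinity absorbs addition, a shortcut built by combining cost vectors is evaluated exactly like the path it replaces.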

4. TUNING DIJKSTRA’S ALGORITHM 

Dijkstra’s algorithm [14] is the textbook solution to the 
shortest path problem, and many modern techniques still 
use it as a subroutine. Fine-tuning its implementation there¬ 
fore directly results in better overall running times, but 
it also tightens the baseline for reporting speedups. (The 
speedup of a technique, which is used as an indication of 
machine-independent performance, is measured in terms of 
its query speed in relation to an implementation of Dijkstra’s 
algorithm.) See, e. g., for a detailed discussion. 

To ensure reproducibility of our experimental findings, we 
document details of our implementation and the reasoning 
behind the choices we made, as much as space allows. As
data structures we use an adjacency array representation of
the graph and a 4-ary heap as queue.
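For readers unfamiliar with the term, an adjacency array stores all arc heads in one flat array, grouped by tail node, plus an offset array that marks where each node's arcs begin. The following is a minimal sketch under that standard definition; the names `first_out` and `head` are assumptions of this example, and cost vectors would be stored in further arrays parallel to `head`.

```python
def build_adjacency_array(n, arcs):
    """Build an adjacency array for a directed graph.

    n:    number of nodes, IDs are 0..n-1
    arcs: list of (tail, head) pairs
    Returns (first_out, head) such that the out-arcs of node u are
    head[first_out[u]:first_out[u + 1]].
    """
    first_out = [0] * (n + 1)
    for u, _ in arcs:
        first_out[u + 1] += 1          # count out-degree of u
    for i in range(n):
        first_out[i + 1] += first_out[i]  # prefix sums give the offsets
    head = [0] * len(arcs)
    pos = first_out[:-1].copy()        # next free slot per tail node
    for u, v in arcs:
        head[pos[u]] = v
        pos[u] += 1
    return first_out, head
```

Iterating over the out-arcs of a node then touches one contiguous memory range, which is what makes the node order discussed next matter for cache behavior.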

4.1 Node Orders 

Node data is usually stored as a large array and the node
IDs correspond to the offset in this array. A small ID
difference therefore implies a high likelihood that the data of
both nodes is loaded simultaneously into the cache. Dijkstra’s
algorithm works by accessing the memory attached to the
two endpoints of an arc directly after another. If both are
in cache, memory access times decrease. To illustrate this
influence we consider three node orders: (a) random
order, (b) input order, and (c) DFS pre-order. A random
order performs the worst as it does not have much locality.
The quality of the input order solely depends on the data
source. Usually it has some locality as nodes often appear in
the order that they were added to the dataset and adjacent
nodes are often added successively. The DFS pre-order
consists of picking a random root node and running a
depth-first search. Nodes get ordered in the way they are first
visited. Every node with pre-order ID i that is not a leaf in
the tree has its first child at ID i + 1, and every node that is
the first child of its parent has its parent at ID i − 1. For the
vast majority of the nodes both conditions hold, so they have
two neighbors with directly adjacent node IDs.
This covers most arcs, as in road networks most nodes have
degree 3 or less.
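A DFS pre-order renumbering can be sketched as follows. This is an illustrative version, not the paper's code: the root is passed in rather than chosen at random, arc directions are ignored, and the function name `dfs_preorder` is an assumption.

```python
def dfs_preorder(adj, root=0):
    """Return new_id such that new_id[v] is the DFS pre-order number of v.

    adj: list of neighbor lists of an undirected graph.
    Nodes not reachable from root are numbered by further searches.
    """
    n = len(adj)
    new_id = [-1] * n
    next_id = 0
    for start in [root] + list(range(n)):
        if new_id[start] != -1:
            continue
        stack = [start]
        while stack:
            u = stack.pop()
            if new_id[u] != -1:
                continue  # u was pushed more than once
            new_id[u] = next_id  # number u on first visit
            next_id += 1
            # push in reverse so that the first neighbor is visited first
            for v in reversed(adj[u]):
                if new_id[v] == -1:
                    stack.append(v)
    return new_id
```

On a path whose input IDs are scrambled, for example 0, 3, 1, 2 in geometric order, the renumbering yields consecutive IDs along the path, so every arc connects nodes whose IDs differ by one.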

4.2 Bidirectional Dijkstra 

Dijkstra’s algorithm works by visiting all nodes around
the source node in order of increasing distance until the target node
is reached. A speedup can be gained by visiting the nodes
around the source and the target node simultaneously. This
variant is called bidirectional search [5, 30].
The central idea consists of running two instances of
Dijkstra’s unidirectional algorithm simultaneously. The first
search explores the nodes close to the source node, while the
other explores the nodes around the target node. Once a
node is reached by both searches, a (not necessarily
shortest) path is found. Denote by $\mu$ the length of the shortest
path found so far. Further denote by $d_F$ the distance of
the next node in the forward instance’s queue and by $d_B$
the distance of the next node of the backward instance. We
abort the search once $d_F + d_B > \mu$, as any path that we
find from that point on has a distance of at least $\mu$. Several
alternation strategies exist that decide from which of the
two queues a node should be popped and processed [30]:
The strategy alternation (alt) switches each step between
forward and backward search. The min-key strategy (mk)
picks the forward search if $d_F < d_B$. The min-queue-size
strategy (mq) picks the forward search if the backward queue
size is not smaller than the forward queue size. Note that if
the considered graph is directed, the backward search must
operate on the reversed graph instead of the input graph.
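The stopping criterion and the alternation (alt) strategy can be sketched as below. This is a minimal illustration with scalar arc weights and a binary heap from the standard library instead of a 4-ary heap. The function names and the dictionary-based graph layout are assumptions of this sketch.

```python
import heapq
import math


def reverse_graph(graph):
    """Reverse every arc of a dict-of-lists graph {u: [(v, weight), ...]}."""
    rev = {}
    for u, arcs in graph.items():
        for v, wgt in arcs:
            rev.setdefault(v, []).append((u, wgt))
    return rev


def bidirectional_dijkstra(fwd, bwd, s, t):
    """Return the s-t distance, or math.inf if t is unreachable.

    fwd is the input graph, bwd the reversed graph.
    """
    dist = ({s: 0}, {t: 0})
    queues = ([(0, s)], [(0, t)])
    graphs = (fwd, bwd)
    mu = 0 if s == t else math.inf  # length of the best path found so far
    side = 0                        # 0 = forward, 1 = backward
    while queues[0] and queues[1]:
        # stopping criterion: d_F + d_B > mu
        if queues[0][0][0] + queues[1][0][0] > mu:
            break
        d, u = heapq.heappop(queues[side])
        if d <= dist[side][u]:  # ignore stale queue entries
            for v, wgt in graphs[side].get(u, []):
                nd = d + wgt
                if nd < dist[side].get(v, math.inf):
                    dist[side][v] = nd
                    heapq.heappush(queues[side], (nd, v))
                    if v in dist[1 - side]:
                        # v was reached by both searches
                        mu = min(mu, nd + dist[1 - side][v])
        side = 1 - side  # alternation strategy
    return mu


# Hypothetical toy graph
toy_fwd = {"s": [("a", 1), ("t", 5)], "a": [("t", 1)]}
```

A stale entry at the top of a queue can only underestimate the corresponding key, so the search may run slightly longer than necessary but never stops too early.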

5. BILEVEL VARIANT OF DIJKSTRA’S ALGORITHM

A bilevel Dijkstra is a preprocessing-based technique to
accelerate shortest path queries. It is a variant of the technique
introduced in [32]. In the preprocessing phase a core graph
$G_C = (V_C, A_C)$ is computed. Think of this core graph as a
coarsened subgraph containing all major roads. The query
phase is a bidirectional variant of Dijkstra’s algorithm.
Conceptually, it first searches locally around the source and the
target nodes until the core is reached on both sides. From
there on the search is restricted to the core graph. This
decreases query times because $G_C$ is smaller than $G$ and
therefore only parts of the graph have to be searched.

Formally the nodes $V_C$ of $G_C$ are a subset of $V$ and called
core nodes. Determining the right set of core nodes is crucial
for performance and detailed in Section 6. The
arcs of the core are defined as follows: For every loop-free
path $v_1 v_2 \ldots v_k$ for which only the endpoints $v_1$ and $v_k$ are
in $V_C$ and all intermediate nodes are in $V \setminus V_C$, there exists
a shortcut arc $(v_1, v_k) \in A_C$ in the core graph. Note that it
is possible that multi-arcs are created by this construction.
The cost vector $c(v_1, v_k)$ of the shortcut is defined as the
combination of the cost vectors of the arcs within the path,
i.e., $c(v_1, v_k) = c(v_1, v_2) \circ \ldots \circ c(v_{k-1}, v_k)$.

Figure 2: OSM (left) and DIMACS (right) data
sources, cf. Figure 1. Panels: (a) OSM TopoCore-IS,
(b) DIMACS TopoCore-IS. The TopoCore-IS is drawn
upon a grayed-out TopoCore, with added shortcuts
between green nodes.

Given a core graph we compute a forward and a backward
search graph as follows: The forward graph $G_F$ is the union
of $G$ and $G_C$ without the arcs $(u, v)$ that leave the core,
i.e., $u \in V_C$ and $v \in V \setminus V_C$. The backward graph $G_B$ is
constructed analogously: First compute the union of $G$ and
$G_C$, then reverse the direction of every arc and finally remove
the arcs leaving the core.

The query phase is a bidirectional variant of Dijkstra’s
algorithm. The forward search is run on $G_F$ while the
backward search runs on $G_B$. We abort the search if $d_F + d_B > \mu$,
where $\mu$ is the tentative distance, and no queue contains a
non-core node.
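The construction of the two search graphs can be written down in a few lines. The sketch below assumes arcs are given as (tail, head, cost) triples and that `shortcuts` contains only the arcs added on top of the input graph, so the union is a plain concatenation. Both the layout and the function name are illustrative.

```python
def build_search_graphs(arcs, shortcuts, core):
    """Return (forward_arcs, backward_arcs) for the bilevel query.

    arcs:      arcs of the input graph as (tail, head, cost)
    shortcuts: additional core arcs as (tail, head, cost)
    core:      set of core nodes
    """
    def leaves_core(u, v):
        return u in core and v not in core

    union = list(arcs) + list(shortcuts)
    # forward graph: drop arcs that leave the core
    forward = [(u, v, c) for (u, v, c) in union if not leaves_core(u, v)]
    # backward graph: reverse every arc first, then drop arcs leaving the core
    reversed_union = [(v, u, c) for (u, v, c) in union]
    backward = [(u, v, c) for (u, v, c) in reversed_union
                if not leaves_core(u, v)]
    return forward, backward
```

Once either search has entered the core it can no longer leave it, which is what confines the bulk of the search to the smaller core graph.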

6. COMPUTING THE CORE NODES 

In the previous section we described how a set of “good”
core nodes is used to realize a bilevel variant of Dijkstra’s
algorithm. In this section we describe how to compute this
set of “good” core nodes. Initially, all nodes are core nodes.
Then, for each node removed from the core, we potentially
have to add shortcuts between all pairs of neighbors, in
order to maintain shortest path distances for the yet
unknown objective function (to be specified in the query). Note
that, unlike [13], we must create multi-arcs if an original
arc between two neighbors is already present (since we
cannot tie-break for an unknown objective function). As the
performance of Dijkstra’s algorithm (and its bilevel variant)
depends on both the number of nodes and arcs, we would
eventually experience diminishing returns if adding too many
new arcs while removing nodes from the core.

Hence, our goal is to select as few core nodes as possible 
while restricting growth in the number of core arcs. In the 
following, we describe three steps performed in succession 
to remove nodes from the core, reducing its size and thus 
accelerating shortest path queries. We refer to the core that 
is produced after Step 2 as TopoCore. The name was chosen 
to reflect that we exploit only topological graph features. 
After Step 3, we refer to the core as TopoCore-IS, where IS 
stands for independent set. 



6.1 Step 1: Removing Dead-Ends 

First, we compute the biconnected components of the
input graph, employing a linear-time algorithm by Tarjan [35].
(For this, we ignore arc directions.) Each dead-end-like
structure is its own tiny component. Everything that entails significant
routing decisions forms a single large component. Hence,
we keep every node in the core that is contained in the
largest biconnected component. Note that we do not add
any shortcuts in this step.
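As an illustration of this step, the following sketch finds the node set of the largest biconnected component with an iterative, edge-stack-based variant of the classic DFS low-point technique. It is not the paper's implementation: measuring "largest" by node count and the function name are assumptions of this example.

```python
def largest_bcc_nodes(n, edges):
    """Return the node set of the largest biconnected component.

    n:     number of nodes, IDs are 0..n-1
    edges: undirected edges as (u, v) pairs (arc directions ignored)
    """
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    disc = [-1] * n   # discovery time
    low = [0] * n     # low-point
    timer = 0
    best = set()
    edge_stack = []
    for root in range(n):
        if disc[root] != -1:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, 0)]  # (node, parent edge index, next adj slot)
        while stack:
            u, pe, i = stack.pop()
            if i < len(adj[u]):
                stack.append((u, pe, i + 1))
                v, idx = adj[u][i]
                if idx == pe:
                    continue  # do not walk back over the tree edge
                if disc[v] == -1:
                    edge_stack.append((u, v))
                    disc[v] = low[v] = timer
                    timer += 1
                    stack.append((v, idx, 0))
                elif disc[v] < disc[u]:
                    edge_stack.append((u, v))  # back edge
                    low[u] = min(low[u], disc[v])
            elif stack:
                p = stack[-1][0]  # u is finished, p is its DFS parent
                low[p] = min(low[p], low[u])
                if low[u] >= disc[p]:
                    # p separates the subtree of u: pop one component
                    comp = set()
                    while True:
                        a, b = edge_stack.pop()
                        comp.add(a)
                        comp.add(b)
                        if (a, b) == (p, u):
                            break
                    if len(comp) > len(best):
                        best = comp
    return best
```

On a triangle with one attached dead-end edge, the dead end forms its own two-node component and the triangle is returned as the core candidate.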

6.2 Step 2: Removing Chains 

Consider the graph induced by all core nodes. Note that
removing a node with only two neighbors from the core, while
adding shortcuts between its neighbors, does not increase
the number of core arcs. Better yet, in our inputs, such nodes are often
not isolated but form chains between two nodes of higher
degree. Moreover, these chains may grow by first applying
Step 1, as intersections exist where all but two roads lead to
dead-ends. First removing dead-ends turns such intersections
into degree-2 nodes. We identify such chains and add shortcut
arcs to the core that bypass them, removing bypassed nodes
from the core. Note that the resulting TopoCore may contain
multi-arcs. See Figure 1 for an illustration.
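Chain removal can be sketched on a simplified model: an undirected graph with a single additive cost per edge. The paper works on directed arcs with generalized cost vectors, so the code below only illustrates the idea, and the function name `contract_chains` is an assumption. From every node whose degree is not 2, it follows each incident edge through degree-2 nodes until the next such node and emits one shortcut with the accumulated cost.

```python
def contract_chains(n, edges):
    """Replace every chain of degree-2 nodes by a single shortcut edge.

    n:     number of nodes, IDs are 0..n-1
    edges: undirected edges as (u, v, cost), costs are additive here
    Returns (core_nodes, core_edges). Cycles consisting only of
    degree-2 nodes are not handled by this sketch.
    """
    adj = [[] for _ in range(n)]
    for idx, (u, v, c) in enumerate(edges):
        adj[u].append((v, c, idx))
        adj[v].append((u, c, idx))
    is_junction = [len(adj[u]) != 2 for u in range(n)]
    used = [False] * len(edges)
    core_edges = []
    for start in range(n):
        if not is_junction[start]:
            continue
        for v, c, idx in adj[start]:
            if used[idx]:
                continue  # chain was already walked from its other end
            used[idx] = True
            total, cur = c, v
            while not is_junction[cur]:
                # cur has exactly two incident edges, continue on the unused one
                (a, ca, ia), (b, cb, ib) = adj[cur]
                nxt, cn, inx = (a, ca, ia) if not used[ia] else (b, cb, ib)
                used[inx] = True
                total += cn
                cur = nxt
            core_edges.append((start, cur, total))
    core_nodes = [u for u in range(n) if is_junction[u] and adj[u]]
    return core_nodes, core_edges
```

Two parallel chains between the same pair of junctions produce two parallel shortcuts, which is the multi-arc case mentioned in the text.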

6.3 Step 3: Removing Degree-3 Nodes 

Ideally, we would like to remove even more nodes from
the core. In case of undirected simple graphs, removing
a node of degree $d$ (i.e., with $d$ neighbors) from the core
removes $d$ edges (to these neighbors) from the core, while
adding $d(d-1)/2$ new edges to the core, i.e., a net increase
of $d(d-3)/2$. Hence for $d = 3$, the number of edges in the
core remains unchanged but the number of nodes decreases.
It is therefore beneficial to remove degree-3 nodes from the
core for a reduction in queue operations during search. Our
experiments in Section 7 show that there is an abundance
of degree-3 nodes in the TopoCore.
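The net change in the number of edges follows from a short calculation, spelled out here for small degrees. It assumes, as in the text, an undirected simple graph in which none of the new edges between neighbors already exists.

```latex
\[
  \underbrace{\binom{d}{2}}_{\text{added}} - \underbrace{d}_{\text{removed}}
  = \frac{d(d-1)}{2} - d
  = \frac{d(d-3)}{2},
  \qquad
  \begin{array}{c|cccc}
    d        & 2  & 3 & 4 & 5 \\ \hline
    d(d-3)/2 & -1 & 0 & 2 & 5
  \end{array}
\]
```

Degree 3 is thus the largest degree for which removal does not add edges. From degree 4 on the edge count grows, and it grows quadratically in $d$.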

In reality, our input graphs are directed and Step 2 may 
have created multi-arcs. We deal with multi-arcs by defining 
the node degree as the number of incident arcs. Furthermore, 
for directed graphs, removing a high-degree node might not
necessarily result in a net increase of core arcs. (For example,
consider a node with a single in-arc: Regardless of its
out-degree, removing the node from the core would decrease the
number of arcs in the core by 1.) Since road networks are
mostly undirected (i. e., most road segments can be traversed 
in both directions), we do not try to exploit such cases, i. e., 
we ignore arc directions to determine node degrees. 

Hence, the idea is to remove degree-3 nodes from the core.
But we cannot just remove all of them, as removing a node
may increase the degree of its neighbors, turning a degree-3
node into a higher degree node. Therefore, we first compute
an independent set of degree-3 core nodes (iterating over the
nodes in DFS pre-order and greedily adding degree-3 nodes
to the set that have no adjacent degree-3 node in the set).
We then remove only this independent set from the core.
See Figure 2 for an illustration of the resulting TopoCore-IS.
One could try to apply this procedure iteratively, but
our experiments indicate that in the TopoCore-IS only few
degree-3 nodes remain.
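The greedy selection described above can be sketched as follows. The adjacency layout (neighbor lists of the undirected core, with parallel edges repeated so that the list length equals the degree) and the function name are assumptions of this illustration.

```python
def degree3_independent_set(adj, order):
    """Greedily pick an independent set of degree-3 nodes.

    adj:   neighbor lists of the undirected core graph; a neighbor
           appears once per parallel edge, so len(adj[u]) is the degree
    order: iteration order over the nodes, e.g. DFS pre-order
    """
    chosen = set()
    for u in order:
        if len(adj[u]) != 3:
            continue
        if any(v in chosen for v in adj[u]):
            continue  # an adjacent degree-3 node was already picked
        chosen.add(u)
    return chosen
```

In a complete graph on four nodes every node has degree 3, yet only the first node in the order is selected, since removing it changes the situation for all its neighbors.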

6.4 Node Orders 

The order in which node data appears in memory has, as
argued in Section 4.1, a significant impact on query speed. We
first reorder the input graph using a DFS pre-order. We then
compute the core and move core nodes to the front of the
order. This yields DFS pre-order inside of the core. Outside
of the core the nodes also have an order that locally behaves
DFS-like. The arcs bridging the largest node-ID differences
tend to be arcs entering or leaving the core.
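Moving the core nodes to the front while keeping the DFS pre-order within each group is a stable partition. A minimal sketch, with hypothetical names, could look like this:

```python
def core_first_order(preorder_ids, is_core):
    """Return final_id[v]: core nodes first, each group in DFS pre-order.

    preorder_ids: preorder_ids[v] is the DFS pre-order ID of node v
    is_core:      is_core[v] is True if v is a core node
    """
    n = len(preorder_ids)
    by_preorder = sorted(range(n), key=lambda v: preorder_ids[v])
    ordered = ([v for v in by_preorder if is_core[v]]
               + [v for v in by_preorder if not is_core[v]])
    final_id = [0] * n
    for new, v in enumerate(ordered):
        final_id[v] = new
    return final_id
```

Since the partition is stable, two core nodes that were close in the pre-order remain close in the final order, and the same holds for two non-core nodes.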

7. EXPERIMENTS 

7.1 Setup and Methodology 

We implemented our algorithms in C++, compiling with
g++ 4.6.3 at optimization level -O3. Our experiments
were performed on a single core of an Intel Xeon E5-2670
processor (Sandy Bridge architecture) clocked at 2.6 GHz,
with 64 GiB of DDR3-1600 RAM clocked at 1.6 GHz, 20 MiB
of L3 and 256 KiB of L2 cache.

We use five different road networks of three different origins
as our test instances. Table 1 reports basic statistics. Figure 3
depicts the geographical regions represented by the graphs.
The two DIMACS instances were published for the 9th
DIMACS Implementation Challenge [12]. DIMACS-Eur was
compiled from NAVTEQ data and kindly made available
by PTV AG; it includes the road networks of 17 Western
European countries. DIMACS-US was derived from the US
Census 2000 TIGER/Line Files produced by the Geography
Division of the US Census Bureau. The OSM instances were
obtained from http://download.geofabrik.de/ at
2014-10-23T20:22:02Z, courtesy of GeoFabrik GmbH [22]. From
that data, we compiled our routing networks using the
graph extraction tools provided by OSRM [26] with the
“car” profile. More precisely, we used this version of the
code:
https://github.com/Project-OSRM/osrm-backend/tree/6f75d68d07a5d1a67219835a0638cd0a482a18f5
OSM-BaWü is the road network of the state of Baden-Württemberg
in Germany, OSM-Ger that of Germany. OSM-Eur contains
the road networks of 48 European regions, including
western Russia. We remove multi-arcs from the input and only
keep the largest strongly connected component to ensure
that between each pair of nodes at least one shortest path
exists. The numbers in Table 1 are the graph sizes after
these standard cleanup procedures were applied. Still, our
OSM graphs are larger than those reported in [17]; we
suspect that the OSM data we use is more recent and therefore
contains more details. Note, however, that our graphs have
a very similar average degree (which for a given data source,
i.e., OSM in this case, indicates a similar degree
distribution) and should therefore behave similarly. For future
reference, we have made our OSM instances publicly available
under http://i11www.iti.uni-karlsruhe.de/resources/roadgraphs.php
in the same format as used in the DIMACS
challenge. The DIMACS instances are available under
http://www.dis.uniroma1.it/challenge9/download.shtml

Table 1: The sizes of our benchmark graphs. We report
the number of nodes |V|, the number of arcs |A|,
and the node degree distribution.

Graph           |V|        |A|
OSM-BaWü        3064K      6184K
OSM-Ger         20690K     41792K
OSM-Eur         173789K    347997K
DIMACS-Eur      18010K     42189K
DIMACS-US       23947K     57709K

                # Nodes per degree
Graph           1        2        3        4        5+
OSM-BaWü        13.3%    72.6%    12.6%    1.2%     0.01%
OSM-Ger         14.2%    70.9%    13.5%    1.3%     0.01%
OSM-Eur         12.1%    76.7%    10.1%    1.1%     0.01%
DIMACS-Eur      26.5%    18.7%    49.1%    5.7%     0.1%
DIMACS-US       19.9%    30.3%    39.0%    10.7%    0.1%

Figure 3: The geographical regions corresponding
to our benchmark graphs. Panels: (a) OSM-BaWü,
(b) OSM-Ger, (c) DIMACS-Eur, (d) DIMACS-US,
(e) OSM-Eur.

We evaluate the performance of our algorithm with respect
to the basic and the generalized PRP problem. For the basic
PRP problem we attach cost vectors with 8 entries to each arc
(as chosen for the largest graph evaluated in [17]). Each cost
entry is a 32-bit int. Each of the test instances provides travel
time t for each road segment, and we infer a road distance d
from the geographical positions of the segment end points.
Unfortunately, we do not have any further road metric that is
available on every instance. We therefore generate 6 further
costs per arc: 100t/d, 100d/t, 100/d, 100/t, 1, and a random
number between 0 and 100. Notice that none of these costs
is a linear combination of the other costs. We therefore have
a sufficiently diverse structure to get meaningful results. For
the generalized PRP we also have 8 costs but only the first 4
are additive. These are t, d, 100t/d, and 100d/t. The last 4
are thresholds such as needed for height limitations. As we
do not have real-world data available, we generate synthetic
data. For every arc a and cost c we roll a 1000-sided die.
If it lands on 0, we attach a random threshold between 0
and 100 to the cost c of the arc a. If the die lands on any
other number we assign a threshold of +∞. Note that we
assign +∞-thresholds with such high probability in order
to ensure connectivity of the graph.
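The synthetic cost generation for the generalized PRP can be sketched as follows. The use of integer division is an assumption of this example, since the text does not state how the derived costs are rounded to 32-bit integers. The function name and the injected random number generator are likewise illustrative.

```python
import math
import random


def generalized_cost_vector(t, d, rng):
    """Return 4 additive costs followed by 4 synthetic thresholds for one arc.

    t:   travel time of the arc (positive integer)
    d:   distance of the arc (positive integer)
    rng: a random.Random instance
    """
    # additive part: t, d, 100t/d, 100d/t (integer division is an assumption)
    additive = (t, d, 100 * t // d, 100 * d // t)
    thresholds = tuple(
        # one face of the 1000-sided die yields a finite threshold in [0, 100]
        rng.randint(0, 100) if rng.randrange(1000) == 0 else math.inf
        for _ in range(4)
    )
    return additive + thresholds
```

With a probability of 1/1000 per arc and cost, only about one in 250 arcs carries any finite threshold, which keeps most of the graph traversable for every vehicle.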

For all query time experiments we sampled 1000 source and 
target pairs uniformly at random. Note that uniform random 
queries are long-distance queries with high probability. 
Typically, most queries issued on real systems, e.g., navigation 
devices, are short-range queries and should be answered 
faster. We make sure that queries are the same for different 
node orderings of the same graph (by permuting the pairs 
according to the node ordering instead of picking a new 
independent set of 1000 random pairs). We further pick a query 
weight w of 8 random entries between 0 and 100 for each 
query. For the generalized PRP problem we interpret the 
last 4 entries as vehicle characteristics that must be below 
a threshold (such as, for example, the vehicle’s height). To 
avoid overflows, all computations are done using 64-bit integer 
arithmetic. Our implementation of Dijkstra’s algorithm 
stores 64-bit tentative distance values for each node. It uses 
a 4-ary heap as queue. 


Table 2: Preprocessing time in seconds. “BCC” is 
the time needed to compute the biconnected components. 
We also report the time needed to randomly 
reorder all nodes and their incident arcs and cost 
vectors in memory. 

              Reorder Nodes     BCC   Insert Shortcuts
OSM-BaWü                1.2     0.8                0.7
OSM-Ger                22.8     6.7                5.8
OSM-Eur               304.0   150.1              202.8
DIMACS-Eur             22.1     7.2                6.5
DIMACS-US              25.3     9.1                8.3

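
The query model of this setup can be sketched as a plain Dijkstra search that evaluates the personalized cost of every arc on the fly. The sketch below is illustrative only. It assumes a hypothetical graph layout in which graph[u] is a list of (v, costs, thresholds) triples, uses Python's binary heapq instead of the 4-ary heap mentioned in the text, and relies on Python's unbounded integers rather than explicit 64-bit arithmetic.

```python
import heapq
import math

def arc_cost(w, costs, thresholds=()):
    """Personalized cost of one arc, or None if a threshold forbids it."""
    # The entries of w after the additive weights are interpreted as
    # vehicle characteristics that must stay below the arc's thresholds.
    for limit, value in zip(thresholds, w[len(costs):]):
        if value >= limit:
            return None
    return sum(wi * ci for wi, ci in zip(w, costs))

def dijkstra(graph, w, source, target):
    """Shortest-path distance from source to target under query weight w."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v, costs, thresholds in graph.get(u, ()):
            c = arc_cost(w, costs, thresholds)
            if c is None:
                continue  # arc not usable for this vehicle
            if d + c < dist.get(v, math.inf):
                dist[v] = d + c
                heapq.heappush(queue, (d + c, v))
    return math.inf
```

For the basic PRP problem every arc carries 8 additive costs and an empty threshold tuple; for the generalized problem it carries 4 additive costs and 4 thresholds.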

7.2 Preprocessing 

In Table 2 we report the time needed by our preprocessing. 
Computing the biconnected components and computing the 
shortcuts are the most expensive algorithmic tasks. However, 
as the table shows, the total running time is dominated by 
seemingly unsophisticated operations such as permuting all 
nodes in memory. The reason is that the cost vectors need 
a lot of space (32 bytes per arc) and need to be reordered 
as well. For example, for OSM-Eur the arc cost data alone 
needs over |A| · 4 · 8 > 10 GB of RAM. Shuffling memory 
is therefore a comparatively expensive task. We therefore 
expect that in a production implementation the running time 
is dominated not by purely algorithmic aspects but by parsing 
the input data. 

Table 3 details the sizes of the various cores obtained. The 
first step, removing the nodes not in the largest biconnected 
component, decreases the node counts by roughly 30% for all 
graphs. How effective removing degree-2 nodes is depends 
on the graph. For the OSM graphs, core sizes decrease by a 
factor of 8 in terms of nodes. The size decrease for DIMACS-US 
is only a factor of 2, and for DIMACS-Eur the node count 
drops by only 40%. Removing degree-3 nodes further decreases 
the node count by roughly 35 to 40%. As expected, the number 
of arcs does not decrease significantly in this final step. 

Besides core sizes, we also report in Table 4 the average 
number of arcs in the degree-2 chains removed from the 
graph. A chain is a sequence of at least 2 arcs where all 
intermediate nodes have degree 2. Note that we first compute 
the biconnected components (BCC) before computing the 
chains. This order increases the chain lengths and thereby the 
effectiveness of our technique. Again, the numbers show that 
the OSM graphs have more degree-2 nodes and thus longer 
chains. We further report the number of degree-3 nodes. 
As expected, this number decreases significantly when going 
from TopoCore to TopoCore-IS. 
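
The chain statistic can be illustrated with a small sketch. It is a simplification under stated assumptions: the graph is treated as undirected and simple, adj is a hypothetical dictionary mapping each node to the set of its neighbors, and cycles consisting only of degree-2 nodes are ignored. The definition of a chain is the one given above; the traversal itself is not taken from the paper.

```python
def degree2_chains(adj):
    """Return the arc count of every maximal degree-2 chain."""
    visited = set()   # degree-2 nodes already assigned to a chain
    lengths = []
    for start, nbrs in adj.items():
        if len(nbrs) == 2:
            continue  # chains are walked from their non-degree-2 end points
        for first in nbrs:
            if len(adj[first]) != 2 or first in visited:
                continue
            prev, cur, arcs = start, first, 1
            while len(adj[cur]) == 2:
                visited.add(cur)
                nxt = next(x for x in adj[cur] if x != prev)
                prev, cur = cur, nxt
                arcs += 1
            lengths.append(arcs)
    return lengths

def average_chain_length(adj):
    """Average number of arcs per chain, or 0.0 if there is no chain."""
    lengths = degree2_chains(adj)
    return sum(lengths) / len(lengths) if lengths else 0.0
```

Each chain found this way could be replaced by a single shortcut between its two end points, which is why longer chains lead to smaller cores.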

Memory Consumption. 

Suppose that the input graph has n nodes and m directed 





Table 3: Core graph sizes. We also report the number of nodes and arcs of each core in percent of the input 
graph’s number of nodes respectively arcs. 

                    Input               BCC          TopoCore       TopoCore-IS
OSM-BaWü    |V|    3 064K     2 095K  68.4%       270K   8.8%       161K   5.3%
            |A|    6 184K     4 489K  72.6%       777K  12.6%       730K  11.8%
OSM-Ger     |V|   20 690K    14 088K  68.1%     1 887K   9.1%     1 125K   5.4%
            |A|   41 792K    30 267K  72.4%     5 430K  13.0%     5 088K  12.2%
OSM-Eur     |V|  173 789K   116 232K  66.9%    13 957K   8.0%     8 414K   4.8%
            |A|  347 997K   248 209K  71.3%    39 145K  11.2%    36 789K  10.6%
DIMACS-Eur  |V|   18 010K    11 763K  65.3%     7 108K  39.5%     4 299K  23.9%
            |A|   42 189K    31 584K  74.9%    20 347K  48.2%    19 387K  46.0%
DIMACS-US   |V|   23 947K    16 020K  66.9%     7 415K  31.0%     4 789K  20.0%
            |A|   57 709K    41 412K  71.8%    24 201K  41.9%    23 754K  41.2%


Table 4: The average number of arcs per degree-2 
chain and the remaining number of degree-3 nodes. 

              Avg. #arcs     Number of degree-3 nodes
              per chain      TopoCore     TopoCore-IS
OSM-BaWü             7.2         249K             20K
OSM-Ger              6.9       1 738K            137K
OSM-Eur              8.5      12 741K          1 478K
DIMACS-Eur           2.7       6 435K            560K
DIMACS-US            3.2       5 481K             40K


Table 5: Input graph size and additional memory 
needed by TopoCore and TopoCore-IS for k = 8. 

Graph              Input     TopoCore    TopoCore-IS
OSM-BaWü          224 MB        28 MB          26 MB
OSM-Ger         1 514 MB       194 MB         179 MB
OSM-Eur        12 610 MB     1 397 MB       1 295 MB
DIMACS-Eur      1 517 MB       726 MB         682 MB
DIMACS-US       2 073 MB       859 MB         834 MB


arcs and that the core graph has n_c nodes and m_c arcs. 
Further, there are k costs, and each ID and cost entry is 
encoded using 32 bits. To store the structure of the input graph 
in an adjacency array, 4(n + 1) + 4m bytes are needed. The cost 
vectors need another 4km bytes of storage. The total space 
required by the input graph is thus 4((n + 1) + (k + 1)m) bytes. 
Similarly, the total additional space required by the core 
graph is 4((n_c + 1) + (k + 1)m_c) bytes. As we reorder all core 
nodes to the front, we do not need to explicitly store which 
nodes are core nodes but can compare the node ID to n_c. Table 5 
depicts the memory consumption for all benchmark graphs. 
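
The space formula can be checked against the reported figures with a few lines of code. The sketch below is an illustration with a hypothetical function name. It plugs in the rounded node and arc counts of OSM-BaWü from Table 3 and assumes that the memory table counts one MB as 2^20 bytes, an assumption under which the rounded results agree with the table.

```python
def adjacency_array_bytes(n, m, k, entry_bytes=4):
    """Bytes for an adjacency array with n nodes, m arcs and k costs per arc."""
    structure = entry_bytes * (n + 1) + entry_bytes * m  # offsets and arc heads
    costs = entry_bytes * k * m                          # k cost entries per arc
    return structure + costs

MIB = 2 ** 20  # assumed meaning of "MB" in the memory table

# OSM-BaWü with k = 8, using the rounded counts from the core size table.
input_mb = round(adjacency_array_bytes(3_064_000, 6_184_000, 8) / MIB)
core_mb = round(adjacency_array_bytes(270_000, 777_000, 8) / MIB)
core_is_mb = round(adjacency_array_bytes(161_000, 730_000, 8) / MIB)
print(input_mb, core_mb, core_is_mb)  # prints: 224 28 26
```

The arc term dominates: with k = 8 every arc costs 36 bytes, while every node costs only 4.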

7.3 Query 

Table 6 compares the performance of Dijkstra’s algorithm 
in its unidirectional and bidirectional variants and with 
all three node orders. Overall, bidirectional search with the 
minimum-queue-size alternation strategy yields the best 
query performance, consistently about 55% faster than 
unidirectional search. Additionally, DFS-reordered nodes improve 
query times by 19-23% compared to the input order. 
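
One way to realize bidirectional search with a minimum-queue-size alternation strategy is sketched below. This is a textbook-style illustration under assumptions, not the evaluated implementation: arcs carry a single already-evaluated non-negative cost, fwd and bwd are hypothetical dictionaries holding outgoing and incoming arcs, and the queue size includes stale entries.

```python
import heapq
import math

def bidirectional_dijkstra(fwd, bwd, source, target):
    """Distance from source to target; fwd[u] = [(v, c)], bwd[v] = [(u, c)]."""
    if source == target:
        return 0
    dist = ({source: 0}, {target: 0})
    queues = ([(0, source)], [(0, target)])
    graphs = (fwd, bwd)
    best = math.inf  # length of the best source-target path found so far
    while queues[0] and queues[1]:
        # Stop once the two search frontiers cannot improve on best.
        if queues[0][0][0] + queues[1][0][0] >= best:
            break
        # Alternation: advance the direction with the smaller queue.
        side = 0 if len(queues[0]) <= len(queues[1]) else 1
        d, u = heapq.heappop(queues[side])
        if d > dist[side][u]:
            continue  # stale queue entry
        for v, c in graphs[side].get(u, ()):
            nd = d + c
            if nd < dist[side].get(v, math.inf):
                dist[side][v] = nd
                heapq.heappush(queues[side], (nd, v))
                if v in dist[1 - side]:
                    best = min(best, nd + dist[1 - side][v])
    return best
```

Advancing the smaller queue tends to keep both search spaces balanced, which is one plausible explanation for the lower pop counts of this strategy.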

However, we also note that the gap to unidirectional search 
on random order is much higher. This raises the question 
of what is a good baseline for determining speedups of 
preprocessing techniques. Especially if these techniques provide 
only comparatively low speedups (e.g., of one order of 
magnitude, because the considered scenario is so involved), it is 
very important to carefully document the baseline. While 
often undocumented, we believe that unidirectional search 
with input order is the variant used in most other studies and 
therefore use it as baseline from here on, too. (However, one 
could argue in favor of a random order, since it eliminates a 
dependency on the data source, which might or might not 
provide a good input order.) 

In Table 7 we report the running times of our query 
algorithm on both variants of the PRP problem. We observe that 
the running times are very similar for both problems. We 
conclude that the running time is bounded by the work done 
by Dijkstra’s algorithm and not the time needed to evaluate 
the costs at the edges. On graphs with an abundance of 
degree-2 nodes (such as OSM) we achieve large speedups of 
approximately 30-55. On graphs with fewer degree-2 nodes 
the results are less impressive, but the speedups of about 
6.2-8.5 are still a significant improvement over the baseline. 

Data Source Dependent Speedups. 

The experimental results presented in Table 7 show that 
speedups achieved by our technique are significantly higher 
on OSM-based graphs (by a factor of up to 51.8/6.2 = 8.4). 
This is due to the significantly higher number of degree-2 
nodes in these graphs, c. f. Table[2 One may wonder whether 
this is a shortcoming of our technique. 

To the best of our knowledge, not many techniques have 
been evaluated on both OSM and non-OSM graphs, with the 
notable exception of |^, which has observed a similar effect: 
The speedup of their technique over Dijkstra’s algorithm is 
up to 14.2 times higher on OSM than on non-OSM graphs^ 

These and our results suggest that OSM-based graphs are 
in some sense easier for speedup techniques compared to 
graphs with the same number of nodes but from other data 
sources. This needs to be considered in the comparison of 
different route planning techniques experimentally evaluated 

^They report speedups of 6 093 ms/1.67 ms = 3 649 on 
DIMACS-Eur, 6 124 ms/1.61 ms = 3 804 on DIMACS-US, 
17 750 ms/1.98 ms = 8 965 on Bing data, but 
77 121 ms/1.49 ms = 51 759 on their largest OSM graph. 
(Considering a route planning scenario different from ours.) 













Table 6: Query running time and number of queue-pop 
operations for variants of Dijkstra’s algorithm 
on the OSM-BaWü graph for the general PRP problem. 
“random”, “input” and “dfs” are the node 
orders considered. They vary in terms of running 
time because of cache effects but not in terms of 
pop operations. “mk”, “alt” and “mq” are the 
alternation strategies. 

                  Time [ms]            Nodes popped
Dir        Random   Input     DFS        from queue
uni           470     265     223            1 539K
bi-mk         371     216     176            1 009K
bi-alt        343     188     156              938K
bi-mq         302     171     143              900K



on road networks of different origin. 

Additional Cost Components. 

So far we have experimented with 8 cost components of 
32 bits each. However, some applications might require 
longer cost vectors. We therefore perform additional query 
experiments on OSM-Ger with TopoCore-IS. For these, 
we pad the existing cost vector of 8 components to 16, 
32, and 64 components of 32 bits each by adding random costs. 
Table 8 reports the average number of queue pop operations 
and running time. The former is almost unaffected by the 
number of cost components. However, the running time 
increases as more memory needs to be accessed. Still, our 
approach scales very well: Going from 8 to 64 components 
requires 8 times more memory, but causes only a factor 2.5 
increase in running time. 


Table 7: Query running time (T) and 
number of queue-pop-operations (P) using the 
TopoCore (TC) and TopoCore-IS (TC-IS) techniques 
and speedup (Sp.up) compared to a unidirectional 
baseline with input order. We use the 
min-queue-size alternation strategy. 

                            Input        TC     TC-IS     Sp.up
OSM-BaWü     T [ms]           265        14         9      29.4
             P [·10^3]      1 539        80        48      32.1
OSM-Ger      T [ms]         2 914       118        80      36.4
             P [·10^3]     10 313       599       357      29.9
OSM-Eur      T [ms]        32 145       891       621      51.8
             P [·10^3]     83 938     3 761     2 266      37.0
DIMACS-Eur   T [ms]         1 817       424       291       6.2
             P [·10^3]      9 015     1 976     1 195       7.5
DIMACS-US    T [ms]         3 045       523       381       8.0
             P [·10^3]     11 912     2 339     1 513       7.9

(a) Basic PRP Problem 

                            Input        TC     TC-IS     Sp.up
OSM-BaWü     T [ms]           258        14         9      27.7
             P [·10^3]      1 504        80        48      31.5
OSM-Ger      T [ms]         2 997       121        86      34.8
             P [·10^3]     10 229       595       354      28.9
OSM-Eur      T [ms]        32 088       781       558      57.5
             P [·10^3]     77 933     3 207     1 928      40.4
DIMACS-Eur   T [ms]         2 024       408       279       7.3
             P [·10^3]      8 965     1 906     1 153       7.8
DIMACS-US    T [ms]         3 260       512       386       8.5
             P [·10^3]     11 885     2 323     1 502       7.9

(b) Generalized PRP Problem 


Table 8: Query performance with varying number 
of cost components on OSM-Ger with TopoCore-IS. 

# Costs           8      16      32      64
Pop [·10^3]     357     354     348     340
Time [ms]        80     108     132     198



7.4 Comparison with Related Work 

While there is vast literature on route planning in road 
networks, most works consider query scenarios different from 
ours, making any direct comparison difficult. We identify 
three classes of approaches related to the Personalized Route 
Planning (PRP) scenario considered in our work: (1) adaptations 
of preprocessing techniques originally designed for 
fixed scalar costs, such as extensions of Contraction Hierarchies 
(CH) [21] that support multiple criteria [19] and 
arc restrictions (e.g., “avoid highways”, vehicle weight limits, 
etc.) [20], or such as Pareto-SHARC; (2) Customizable 
Route Planning approaches 1^ 1^ [13]; (3) previous Personalized 
Route Planning approaches [17]. We report a detailed 
comparison of these approaches in Table 9. 

While plain CH (single fixed criterion, i.e., travel time) 
yields query times more than three orders of magnitude faster 
than ours, performance quickly degrades when considering 
arc restrictions or multiple criteria: While exact comparisons 
are difficult due to differences in benchmark instances, 
one roughly observes that considering arc restrictions, as 
well as each additional criterion, decreases 
query speed by about an order of magnitude (0.152 ms → 
1.18 ms, 0.152 ms → 0.98 ms, 0.42 ms → 3.16 ms). For three 
(somewhat correlated) criteria (distance, travel time, and 
fuel costs), CH performance on OSM-BaWü is already only 
a factor of 3-9 faster than our approach in terms of query 
times and reported speedup [18]. This degradation of 
performance for more than two criteria likely means that the 
Contraction Hierarchies approach does not extend well to 
the PRP scenario considered in this work (an assessment 
also made by [17]). A similar, even stronger argument can 
be made against extending Pareto-SHARC for PRP. 

Customizable Route Planning (CRP), introduced by 1^, is 
closely related to PRP. However, instead of considering user 
preferences and restrictions as an input to each query, the 
cost of each arc (in the input graph as well as shortcuts) is 
established in a relatively quick customization phase. In this 
phase, combinations of different criteria as well as restrictions 
(or live traffic delays) may be considered, but then, each 
subsequent query works on a single-criterion fixed metric. 
The original publication on CRP uses multi-level overlays 
and shortcuts [^, whereas CCH [13] is an adaptation of CH 
to the customization setting. In [24] a better contraction 
order computation strategy is introduced, resulting in the 
numbers of Table 9. Directly applying both these techniques 
to PRP (by paying customization time for every change in 
















Table 9: Comparison to related work. We report the number of criteria (# Crit.) considered by each 
approach, the instance (in name and size) on which it was evaluated, the preprocessing time required, and 
the query time and speedup (over Dijkstra’s algorithm) achieved. Where applicable we report customization 
time. We note if figures do not apply (—) or have not been reported (n/a). All timings are sequential, except 
for the GPU extension of CRP. CRP techniques were evaluated on an instance augmented with artificial 
U-turn costs. Differences in OSM graph size of the same instance are, to the best of our knowledge, due to 
different extraction dates. 

                                                           |V|      |A|    Prepro.   Custom.    Query
Algorithm                  # Crit.  Instance            [·10^6]  [·10^6]  [h:m:s]      [ms]     [ms]   Speedup
CH [21]                       1     DIMACS-Eur            18.0     42.2      2:45        —     0.152       n/a
CH with restrictions [20]     1     NAVTEQ-US/CA          21.1     52.5   7:21:00        —      1.18     2 935
Pareto-SHARC [10]             2     DIMACS-Eur            18.0     42.2   7:12:00        —      35.4       n/a
FlexCH                        2     DIMACS-Eur            18.0     42.2   5:12:00        —      0.98     6 183
MultiCH [18]                  2     OSM-BaWü**             2.5      5.0      2:01        —      0.42       965
MultiCH [18]                  3     OSM-BaWü**             2.5      5.0      1:08        —      3.16       234
CRP                           —     DIMACS-Eur (Turn)     18.0     42.2     11:53     3 770     1.67     3 649
CCH [13, 24]                  —     DIMACS-Eur            18.0     42.2   4:40:41     2 322     0.27       n/a
CRP on GPU [9]                —     DIMACS-Eur (Turn)     18.0     42.2     28:56     129.3     1.17       n/a
k-Path Cover [17]             8     OSM-BaWü*              2.2      4.6        12        —        35      10.8
k-Path Cover [17]             8     OSM-Ger*              17.7     36.1      2:29        —       249      13.1
TopoCore-IS                   8     OSM-BaWü               3.1      6.2         3        —         9      27.7
TopoCore-IS                   8     OSM-Ger               20.7     41.8        35        —        86      34.8
TopoCore-IS                   8     DIMACS-Eur            18.0     42.2        36        —       279       7.3
TopoCore-IS                   8     DIMACS-US             23.9     57.7        43        —       386       8.5



user preferences), we observe that our approach to PRP 
outperforms them both if user preferences change with every 
query or up to every 8th query. (For perspective, recall the 
example of a fast route in the morning and a safe and 
fuel-efficient one in the evening.) While customization can be 
parallelized on multiple CPU cores [8, 13], only if it is highly 
parallelized on an external GPU [9] does it become faster than 
our sequential queries. While having a GPU (for every 
concurrent user) is a strong assumption on the given computer 
hardware, we note that, even then, we achieve queries within 
the same order of magnitude (279 ms compared to 129.3 + 1.17 = 
130.47 ms). Furthermore, in a server setting, PRP-based 
approaches have no per-user memory consumption overhead 
(other than storing the objective function, if at all), whereas 
the per-user overhead for CRP and CCH depends on the 
graph size. 
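
The break-even point between per-query personalization and customization can be made explicit with a small calculation. The text does not spell out how the threshold of 8 queries was derived; the sketch below assumes it compares total times on DIMACS-Eur, using the customization and query times of Table 9 and the 279 ms TopoCore-IS query time. The function name is hypothetical.

```python
def break_even_queries(custom_ms, query_ms, prp_query_ms):
    """Number of queries per preference change below which PRP is faster.

    q PRP queries take q * prp_query_ms, whereas a customizable technique
    takes custom_ms + q * query_ms, so PRP wins while
    q < custom_ms / (prp_query_ms - query_ms).
    """
    if prp_query_ms <= query_ms:
        return 0.0  # PRP queries are never faster in total
    return custom_ms / (prp_query_ms - query_ms)

PRP_MS = 279  # TopoCore-IS query time on DIMACS-Eur

cch = break_even_queries(2322, 0.27, PRP_MS)      # between 8 and 9
crp = break_even_queries(3770, 1.67, PRP_MS)      # between 13 and 14
crp_gpu = break_even_queries(129.3, 1.17, PRP_MS) # below 1
```

Under these assumptions CCH is the tighter of the two sequential competitors, which is consistent with the stated limit of a preference change every 8th query, and the GPU variant is already faster when preferences change with every query.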

Finally, for a direct comparison for the Personalized Route 
Planning scenario, we contrast our results with those 
obtained by the k-Path Cover approach of [17] (which 
introduced the PRP scenario). On OSM graphs our PRP 
query speedup of 27.7-57.5 more than doubles the maximum 
speedup of 13.2 previously achieved by [17], while having 
lower preprocessing overhead. This observation is also 
supported by differences in absolute query runtime, even more 
so when considering the respective increase in OSM dataset 
size. Unfortunately, for their query experiments the authors 
of [17] focus exclusively on OSM graphs, hence we cannot 
compare on DIMACS graphs without speculation. 


8. CONCLUSIONS 

We evaluated a preprocessing-based speedup technique for 
faster Personalized Route Planning. On all tested instances, 
which include very large-scale networks with hundreds of 
millions of nodes, we were able to achieve running times well 
below a second. This is fast enough for many applications, 
including web services with a moderate user base. The main 
advantage of Personalized Route Planning is that costs 
are individually adjusted for every user and every query in a 
very flexible way. Rerunning preprocessing is only necessary 
when roads are built or cost vectors are adjusted (e.g., a new 
speed limit is posted). We evaluated our technique both on 
OpenStreetMap data and on datasets from the 9th DIMACS 
implementation challenge, showing that it performs well on 
a large range of instances. 